Transfer language selection for zero-shot cross-lingual abusive language detection

Abstract

We study the selection of transfer languages for automatic abusive language detection. Instead of preparing a dataset for every language, we demonstrate the effectiveness of cross-lingual transfer learning for zero-shot abusive language detection. This way we can use existing data from higher-resource languages to build better detection systems for low-resource languages. Our datasets are from seven different languages from three language families. We measure the distance between the languages using several language similarity measures, especially by quantifying the World Atlas of Language Structures. We show that there is a correlation between linguistic similarity and classifier performance. This discovery allows us to choose an optimal transfer language for zero-shot abusive language detection.
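The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the similarity measures, so the feature-overlap score below is an assumption, and every feature name, language label, and score is a made-up placeholder rather than real World Atlas of Language Structures data or a reported result. The sketch scores each candidate source language by the share of typological features it has in common with the target, picks the most similar one, and checks how similarity relates to (toy) classifier scores.

```python
from math import sqrt


def feature_similarity(a, b):
    """Share of features defined for both languages that have the same value."""
    shared = [key for key in a if key in b]
    if not shared:
        return 0.0
    return sum(a[key] == b[key] for key in shared) / len(shared)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Hypothetical typological features (placeholders, not real WALS entries).
target = {"word_order": "SVO", "gender": "none", "case_count": "many", "articles": "no"}
sources = {
    "lang_a": {"word_order": "SVO", "gender": "none", "case_count": "many", "articles": "yes"},
    "lang_b": {"word_order": "SOV", "gender": "none", "case_count": "few", "articles": "yes"},
    "lang_c": {"word_order": "SOV", "gender": "two", "case_count": "few", "articles": "yes"},
}

# Toy zero-shot scores on the target language (invented for illustration only).
toy_f1 = {"lang_a": 0.70, "lang_b": 0.60, "lang_c": 0.50}

similarities = {name: feature_similarity(feats, target) for name, feats in sources.items()}
best_source = max(similarities, key=similarities.get)
r = pearson([similarities[n] for n in sources], [toy_f1[n] for n in sources])

print(best_source, similarities, round(r, 3))
```

With real data, the feature dictionaries would come from a typological database and the scores from actual zero-shot experiments; a strong positive correlation would support using similarity as a proxy when choosing a transfer language.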

Similar resources

Zero-shot Cross Language Text Classifica-

Labeled text classification datasets are typically only available in a few select languages. In order to train a model for, e.g., news categorization in a language Lt without a suitable text classification dataset, there are two options. The first option is to create a new labeled dataset by hand, and the second option is to transfer label information from an existing labeled dataset in a source la...

Image-Mediated Learning for Zero-Shot Cross-Lingual Document Retrieval

We propose an image-mediated learning approach for cross-lingual document retrieval where no or only a few parallel corpora are available. Using the images in image-text documents of each language as the hub, we derive a common semantic subspace bridging two languages by means of generalized canonical correlation analysis. For the purpose of evaluation, we create and release a new document data...

Cross-Lingual Lexico-Semantic Transfer in Language Learning

Lexico-semantic knowledge of our native language provides an initial foundation for second language learning. In this paper, we investigate whether and to what extent the lexico-semantic models of the native language (L1) are transferred to the second language (L2). Specifically, we focus on the problem of lexical choice and investigate it in the context of three typologically diverse languages...

Zero-Shot Learning Through Cross-Modal Transfer

This work introduces a model that can recognize objects in images even if no training data is available for the objects. The only necessary knowledge about the unseen categories comes from unsupervised large text corpora. In our zero-shot framework distributional information in language can be seen as spanning a semantic basis for understanding what objects look like. Most previous zero-shot le...

One-Shot Neural Cross-Lingual Transfer for Paradigm Completion

We present a novel cross-lingual transfer method for paradigm completion, the task of mapping a lemma to its inflected forms, using a neural encoder-decoder model, the state of the art for the monolingual task. We use labeled data from a high-resource language to increase performance on a low-resource language. In experiments on 21 language pairs from four different language families, we obtain ...

Journal

Journal title: Information Processing and Management

Year: 2022

ISSN: 0306-4573, 1873-5371

DOI: https://doi.org/10.1016/j.ipm.2022.102981